Engineering posts about Reinforcement Learning

Curated summaries and key learnings for engineers working with Reinforcement Learning.

Pinterest
14m

Making User-Sequence Data More Cost-Efficient, Faster, and Easier to Use

This article discusses the redesign of a user-sequence platform aimed at improving the efficiency, speed, and usability of user data for machine learning applications. It addresses the challenges...

Salesforce
5m

How Salesforce Built an AI Security Agent for Autonomous Threat Triage

The article outlines how Salesforce developed the SATA agent, an AI-driven system designed to enhance cybersecurity by autonomously triaging threats across complex environments. It highlights the...

Databricks
13m

From manual to autonomous: how AI agents are transforming electric grid operations

The electric utility industry is facing unprecedented operational challenges due to increasing demand and aging infrastructure, necessitating the adoption of AI agents to enhance grid reliability and...

Salesforce
7m

Creating a Multi-Tenant AI Agent Platform Handling 7K+ Sessions Without Cross-Team Interference

The article outlines the development of the Bring Your Own Planner (BYOP), a multi-tenant AI agent platform designed to enhance team autonomy and scalability within Salesforce. It addresses the...

Microsoft
4m

The JavaScript AI Build-a-thon Season 2 starts today!

The JavaScript AI Build-a-thon is a comprehensive program aimed at bridging the gap in AI development for JavaScript and TypeScript developers. Spanning four weeks, the event includes self-paced...

Meta (Facebook)
1m

Reel Friends: Building Social Discovery that Scales to Billions

This Meta Tech Podcast episode featuring Pascal Hartig explores the engineering intricacies behind the 'Friend Bubbles' feature of Facebook Reels. The discussion highlights the evolution of...

Google
13m

Build Long-running AI agents that pause, resume, and never lose context with ADK

This article presents a comprehensive guide to building long-running AI agents that can pause, resume, and maintain context using the Agent Development Kit (ADK). It highlights the limitations of...

Pinterest
6m

Enhancing Ad Relevance: Integrating Real-Time Context into Sequential Recommender Models

The article presents a novel approach to enhancing ad relevance by integrating real-time context into sequential recommender models. It highlights the limitations of previous models that relied...

Apple
3m

PORTool: Importance-Aware Policy Optimization with Rewarded Tree for Multi-Tool-Integrated Reasoning

The article introduces PORTool, an importance-aware policy optimization algorithm designed for multi-tool-integrated reasoning in large language model (LLM) empowered agents. It addresses the...

Apple
3m

Reinforced Agent: Inference-Time Feedback for Tool-Calling Agents

The article introduces the concept of a Reinforced Agent that enhances tool-calling agents by incorporating inference-time feedback. This approach aims to address the limitations of traditional...

Salesforce
6m

How AI-Driven Kubernetes Optimization Reclaimed Millions from 47% Idle Capacity

The article discusses Salesforce's challenges with infrastructure scaling on its Hyperforce platform, particularly regarding over-provisioning and idle capacity in Kubernetes services. It introduces...

Apple
3m

DSO: Direct Steering Optimization for Bias Mitigation

The article presents Direct Steering Optimization (DSO), a novel approach aimed at mitigating bias in vision-language models (VLMs) and large language models (LLMs). It highlights the challenges...

Pinterest
9m

From Clicks to Conversions: Architecting Shopping Conversion Candidate Generation at Pinterest

The article discusses Pinterest's development of a shopping conversion candidate generation model aimed at optimizing offsite conversion events, which are typically sparse and noisy. It details the...

DigitalOcean
9m

Beyond the Abyss: Project Poseidon’s Quest for Zero-Downtime Reliability

The article outlines the development of Project Poseidon, a predictive monitoring system designed to enhance reliability in large-scale cloud environments by leveraging machine learning and...

Apple
11m

ParaRNN: Large-Scale Nonlinear RNNs, Trainable in Parallel

The article presents ParaRNN, a novel framework developed by Apple researchers that significantly enhances the training efficiency of Recurrent Neural Networks (RNNs) by enabling parallelization....

Databricks
7m

How to transform document activation workflows with Genie and Agent Bricks

The article outlines the challenges organizations face in managing document workflows, emphasizing the need for a unified data foundation to leverage AI effectively. It introduces Databricks'...

Databricks
8m

Are LLM agents good at join order optimization?

This article explores the application of large language models (LLMs) to join order optimization in SQL queries, a long-standing challenge in database management. Traditional...

Google
5m

Production-Ready AI Agents: 5 Lessons from Refactoring a Monolith

The article outlines the challenges of developing production-ready AI agents, particularly focusing on the transition from monolithic architectures to orchestrated sub-agents. It details a case study...

Cloudflare
17m

Agents that remember: introducing Agent Memory

The article introduces Agent Memory, a managed service designed to enhance AI agents by providing them with persistent memory capabilities. This service addresses the challenge of context management...

Apple
19m

International Conference on Learning Representations (ICLR) 2026

The International Conference on Learning Representations (ICLR) 2026 showcases significant advancements in deep learning research, with Apple presenting multiple papers and technical demos. The...